NSF PAR Search | NSF Public Access Repository

FluidKV: Seamlessly Bridging the Gap between Indexing Performance and Memory-Footprint on Ultra-Fast Storage

https://doi.org/10.14778/3648160.3648177

Lu, Ziyi; Cao, Qiang; Jiang, Hong; Chen, Yuxing; Yao, Jie; Pan, Anqun (February 2024, Proceedings of the VLDB Endowment)

Our extensive experiments reveal that existing key-value stores (KVSs) achieve high performance at the expense of a huge memory footprint that is often impractical or unacceptable. Even with the emerging ultra-fast byte-addressable persistent memory (PM), KVSs fall far short of delivering the high performance promised by PM's superior I/O bandwidth. To find the root causes and bridge the huge performance/memory-footprint gap, we revisit the architectural features of two representative indexing mechanisms (single-stage and multi-stage) and propose a three-stage KVS called FluidKV. FluidKV effectively consolidates these indexes by quickly and seamlessly moving the incoming key-value request stream from the write-concurrent frontend stage, across an intermediate stage, to the memory-efficient backend stage. FluidKV also introduces important enabling techniques, such as thread-exclusive logging, PM-friendly KV-block structures, and dual-grained indexes, to fully utilize both the parallel-processing and high-bandwidth capabilities of ultra-fast storage hardware while reducing overhead. We implemented a FluidKV prototype and evaluated it under a variety of workloads. The results show that FluidKV outperforms the state-of-the-art PM-aware KVSs, including ListDB and FlatStore with different indexes, by up to 9× and 3.9× in write and read throughput respectively, while cutting up to 90% of the DRAM footprint.
Full Text Available
HF-LDPC: HLS-friendly QC-LDPC FPGA Decoder with High Throughput and Flexibility

https://doi.org/10.1109/ICCD58817.2023.00091

Zhang, Yifan; Cao, Qiang; Wang, Shaohua; Yao, Jie; Jiang, Hong (November 2023, IEEE)

LDPC (Low-Density Parity-Check) codes have become a cornerstone of transforming a noise-filled physical channel into a reliable and high-performance data channel in communication and storage systems. FPGA (Field-Programmable Gate Array) based LDPC hardware, especially for decoding with high complexity, is essential to realizing high-bandwidth channel prototypes. HLS (High-Level Synthesis) was introduced to speed up the FPGA development of LDPC hardware by automatically compiling high-level abstract behavioral descriptions into RTL-level implementations, but the results are often sub-optimal because effective low-level descriptions are lacking. To overcome this problem, this paper proposes an HLS-friendly QC-LDPC FPGA decoder architecture, HF-LDPC, that employs HLS not only to precisely characterize high-level behaviors but also to effectively optimize the low-level RTL implementation, thus achieving both high throughput and flexibility. First, HF-LDPC designs a multi-unit framework with a balanced I/O-computing dataflow to adaptively match code parameters with FPGA configurations. Second, HF-LDPC presents a novel fine-grained task-level pipeline with interleaved updating to eliminate stalls due to data interdependence within each updating task. HF-LDPC also presents several HLS-enhanced approaches. We implement and evaluate HF-LDPC on Xilinx U50, which demonstrates that HF-LDPC outperforms existing implementations by 4× to 84× with the same parameters and scales linearly up to 116 Gbps actual decoding throughput with high hardware efficiency.
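For readers unfamiliar with the "QC" in QC-LDPC, the following minimal Python sketch illustrates the general idea only: a quasi-cyclic parity-check matrix is expanded from a small base matrix of cyclic shifts, and a word is a valid codeword when its syndrome is all zeros. The base matrix `BASE`, the block size `Z`, and the function names are hypothetical values chosen for illustration. None of this is taken from HF-LDPC, and it is not a decoder.

```python
def circulant(shift, z):
    """Return a z x z identity matrix cyclically shifted by `shift`.

    By the usual convention, a shift of -1 denotes the all-zero block.
    """
    if shift < 0:
        return [[0] * z for _ in range(z)]
    return [[1 if (r + shift) % z == c else 0 for c in range(z)]
            for r in range(z)]


def expand(base, z):
    """Expand a base matrix of shifts into the full binary matrix H."""
    rows = []
    for base_row in base:
        blocks = [circulant(s, z) for s in base_row]
        for r in range(z):
            # Concatenate row r of every block in this block row.
            rows.append([bit for block in blocks for bit in block[r]])
    return rows


def syndrome(h, word):
    """Compute H * word^T over GF(2); all zeros means every check passes."""
    return [sum(hb & wb for hb, wb in zip(row, word)) % 2 for row in h]


# Hypothetical toy parameters (assumptions, not from the paper).
BASE = [[0, 1, -1],
        [2, -1, 0]]
Z = 4
H = expand(BASE, Z)  # 8 parity checks over 12 code bits

zero_word = [0] * (3 * Z)
flipped = [1] + [0] * (3 * Z - 1)  # the zero codeword with bit 0 in error

print(syndrome(H, zero_word))
print(syndrome(H, flipped))
```

The all-zero word satisfies every check, while the single flipped bit violates exactly the two checks connected to it. Because each block is a shifted identity, hardware only needs to store the shift values rather than the full matrix, which is one reason quasi-cyclic codes suit FPGA implementations.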
Full Text Available
Evolutionary history and pan-genome dynamics of strawberry (Fragaria spp.)

https://doi.org/10.1073

Qiao, Qin; Edger, Patrick; Xue, Li; Qiong, La; Lu, Jie; Zhang, Yichen; Cao, Qiang; Yocca, Alan; Platts, Adrian; Knapp, Steven; et al. (November 2021, Proceedings of the National Academy of Sciences of the United States of America)
Full Text Available
Improving Overall Performance of TLC SSD by Exploiting Dissimilarity of Flash Pages

https://doi.org/10.1109/TPDS.2019.2934458

Zhang, Wenhui; Cao, Qiang; Jiang, Hong; Yao, Jie (February 2020, IEEE Transactions on Parallel and Distributed Systems)

Full Text Available
A Novel Multi-Stage Forest-Based Key-Value Store for Holistic Performance Improvement

https://doi.org/10.1109/TPDS.2019.2950248

Lu, Ziyi; Cao, Qiang; Mei, Fei; Jiang, Hong; Li, Jingjun (April 2020, IEEE Transactions on Parallel and Distributed Systems)

Full Text Available
A Fast Filtering Mechanism to Improve Efficiency of Large-Scale Video Analytics

https://doi.org/10.1109/TC.2020.2970413

Zhang, Chen; Cao, Qiang; Jiang, Hong; Zhang, Wenhui; Li, Jingjun; Yao, Jie (June 2020, IEEE Transactions on Computers)

Full Text Available
Analysis of and Optimization for Write-dominated Hybrid Storage Nodes in Cloud

https://doi.org/10.1145/3357223.3362705

Liu, Shuyang; Wang, Shucheng; Cao, Qiang; Lu, Ziyi; Jiang, Hong; Yao, Jie; Dong, Yuanyuan; Yang, Puyuan (November 2019, ACM Symposium on Cloud Computing)

Full Text Available
Toward live inter-domain network services on the ExoGENI testbed

https://doi.org/10.1109/INFCOMW.2018.8407026

Yao, Yuanjun; Cao, Qiang; Farias, Rubens; Chase, Jeff; Orlikowski, Victor; Ruth, Paul; Cevik, Mert; Wang, Cong; Buraglio, Nick (April 2018, IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS))

A key dimension of reproducibility in testbeds is stable performance that scales in regular and predictable ways in accordance with declarative specifications for virtual resources. We contend that reproducibility is crucial for elastic performance control in live experiments, in which testbed tenants (slices) provide services for real user traffic that varies over time. This paper gives an overview of ExoPlex, a framework for deploying network service providers (NSPs) as a basis for live inter-domain networking experiments on the ExoGENI testbed. As a motivating example, we show how to use ExoPlex to implement a virtual software-defined exchange (vSDX) as a tenant NSP. The vSDX implements security-managed interconnection of customer IP networks that peer with it via direct L2 links stitched dynamically into its slice. An elastic controller outside of the vSDX slice provisions network links and computing capacity for a scalable monitoring fabric within the tenant vSDX slice. The vSDX checks compliance of traffic flows with customer-specified interconnection policies, and blocks traffic from senders that trigger configured rules for intrusion detection in Bro security monitors. We present initial results showing the effect of resource provisioning on Bro performance within the vSDX.
Full Text Available
